Tasks

  • Reproject the world dataset to a global equal area projection
  • Write a parallel foreach() loop to identify the spatial relationships of each country
  • Set the output of the foreach() function to return a simple matrix
  • Confirm that your parallel loop returns the same answer as a typical “sequential” approach
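
As a rough illustration of the last two tasks, the sketch below runs the same toy computation with %dopar% and with %do% and compares the results. The helper row_for() is hypothetical and only stands in for the per-country work; the two-worker backend is an assumption that mirrors the setup code used in this case study.

```r
library(foreach)
library(doParallel)

registerDoParallel(2)  # assumed: two workers

# Hypothetical stand-in for the per-country computation:
# each iteration returns a one-row matrix.
row_for <- function(i) matrix(c(i, i^2), nrow = 1)

# .combine = rbind stacks the one-row matrices into a single matrix
par_result <- foreach(i = 1:5, .combine = rbind) %dopar% row_for(i)
seq_result <- foreach(i = 1:5, .combine = rbind) %do% row_for(i)

# Compare the parallel result with the sequential one
all.equal(par_result, seq_result)

stopImplicitCluster()  # release the workers
```

Swapping %dopar% for %do% is usually the simplest way to get a sequential baseline, since the loop body stays the same.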

Background

The census data do not include specific addresses (the finest spatial unit is the census block), so it’s common to see choropleths representing the aggregate statistics of the underlying polygon. This is accurate, but not so personal. Folks at the University of Virginia developed a simple yet effective visualization approach, called the ‘Racial Dot Map’, which conveys a simple idea: one dot equals one person.

The idea is really simple: randomly generate a point for each person of each racial identity within each polygon. Can you do it? Can you do it using multiple cores on your computer?

library(tidyverse)
library(spData)
library(sf)

## New Packages
library(mapview) # new package that makes easy leaflet maps
library(foreach)
library(doParallel)
registerDoParallel(2)
getDoParWorkers() # check registered cores

Steps

Write an Rmd script that:

  • Downloads block-level data on population by race in each census block in Buffalo using the get_decennial() function of the tidycensus package. You can use the following code:
library(tidycensus)
racevars <- c(White = "P005003", 
              Black = "P005004", 
              Asian = "P005006", 
              Hispanic = "P004003")

options(tigris_use_cache = TRUE)
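# The variable codes above are 2010 decennial census codes, so request 2010 explicitly
erie <- get_decennial(geography = "block", variables = racevars, year = 2010,
                  state = "NY", county = "Erie County", geometry = TRUE,
                  summary_var = "P001001", cache_table = TRUE)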
  • Crop the county-level data to c(xmin=-78.9,xmax=-78.85,ymin=42.888,ymax=42.92) to reduce the computational burden. Feel free to enlarge this area if your computer is fast (or you are patient).
  • Write a foreach loop that does the following steps for each racial group in the variable column of the erie dataset and rbind the results.
    • filter the data to include only one race at a time
    • use st_sample() to generate a random point for each person who resided within each polygon. You will have to set size=.$value. The . refers to the dataset that was piped into the function, so .$value is its value column.
    • convert the points to spatial features with st_as_sf()
    • mutate to add a column named variable that is set to the current racial group (from the foreach loop)
  • Use the mapview() function in the mapview package to make a leaflet map of the dataset and set the zcol to the racial identity of each point. You can adjust any of the visualization parameters (such as cex for size).
## Warning: attribute variables are assumed to be spatially constant
## throughout all geometries
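
The crop and foreach steps above can be sketched on toy data as follows. This is only a sketch under stated assumptions: the 3 x 3 grid, the EPSG:4326 coordinate system, and the counts in value are made up so the example runs without downloading census data, and only the column names variable and value mirror the tidycensus output.

```r
library(sf)
library(dplyr)
library(foreach)
library(doParallel)

registerDoParallel(2)  # assumed: two workers

# Toy stand-in for the block data: a 3 x 3 grid of cells around the crop
# window, repeated for two groups with made-up counts (not census data).
outer_box <- st_bbox(c(xmin = -79.0, ymin = 42.85, xmax = -78.8, ymax = 42.95),
                     crs = st_crs(4326))
grid <- st_make_grid(st_as_sfc(outer_box), n = c(3, 3))
blocks <- st_sf(
  variable = rep(c("White", "Black"), each = 9),
  value    = rep(c(3, 2), each = 9),
  geometry = c(grid, grid)
)

# Crop to the window given in the steps above
buffalo <- st_crop(blocks, c(xmin = -78.9, xmax = -78.85, ymin = 42.888, ymax = 42.92))

# One iteration per group; rbind stacks the per-group point layers
dots <- foreach(r = unique(buffalo$variable), .combine = rbind,
                .packages = c("sf", "dplyr")) %dopar% {
  buffalo %>%
    filter(variable == r) %>%       # keep one group at a time
    st_sample(size = .$value) %>%   # one random point per person, per polygon
    st_as_sf() %>%                  # sfc points -> sf data frame
    mutate(variable = r)            # label the points with the current group
}

table(dots$variable)

stopImplicitCluster()  # release the workers
```

st_crop() typically emits the "attribute variables are assumed to be spatially constant" warning on attribute data like this. The resulting dots object can then be passed to mapview() with zcol = "variable".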

Your final result should look something like this:

Update the map to:

  • Include other racial groups
  • Use colors that match the original
  • Summarize the data in different ways